A universally programmable Quantum Cellular Automaton 

We discuss the role of classical control in the context of reversible quantum cellular automata.
Employing the structure theorem for quantum cellular automata, we give a general construction
scheme to turn an arbitrary cellular automaton with external classical control into an autonomous
one, thereby proving the computational equivalence of these two models. We use this technique to
construct a universally programmable cellular automaton on a one-dimensional lattice with single
cell dimension 12.

Many architectures have been proposed for the construction of a quantum computer. While the
earliest algorithms were considered in a model based on unitary gates, recent years have seen
ideas like the one-way quantum computer, in which the non-unitary acts of measurement play the
key role, or adiabatic computing, in which the continuous time dynamics plays a key role. It is
quite possible that substantially new ways of quantum computation will surface.

Studies showing how different quantum computational models can simulate each other are valuable
in constructing universal paradigms. They also provide perhaps the clearest expression of the
primitives in each computational model that are responsible for generating computational power
stronger than that of classical computation. Since the major obstacles against quantum
computation will for a long time be difficulties in implementation, a further incentive for such
alternatives is to generate ideas of how to adapt the computational model to an available set of
primitives.

In this paper we chiefly consider two different computational models, both of which are quantum
cellular automata (QCAs), i.e., distributed systems of lattice cells with a spatially homogeneous
discrete time dynamical evolution of strictly finite propagation speed. We use the definition for
QCAs developed in [1]. This differs from an older definition given in [?], which is not consistent
with the iteration of dynamics (see chapter V in [?]).

Our two models differ from each other in the way the program operates, or more precisely in how
the quantum part of the computer interacts with a classical controller, being somewhat analogous
to the gate model and the Turing machine model respectively. In the gate model the classical
controller has to be very powerful: on receiving the input, it will compile the program in a
version adapted to the size of the input, and actually build a quantum circuit to run it. The
flexibility of this model hence largely resides in the classical controller, and the quantum
computer hardware is, in principle, scrapped after each run.



In contrast, a classical universal Turing machine takes its flexibility from the possibility of
writing both the program and the input data on its tape for initialization. We can apply these
ideas to running a quantum cellular automaton as a computer: on the one hand, we can use a
classical compiler to select a classical sequence of operations, each of which is a QCA time step
in its own right. Such a machine will be called a classically controlled QCA (ccQCA). On the other
hand, we can insist that program and data are written into the system by the initial preparation,
after which the machine runs autonomously for a fixed number of steps, with a fixed transition
rule independent of the program. The only role of the classical controller is then final
measurement.

It is entirely possible that the absence of classical signalling from this second model (except
at initialisation and readout), coupled with temporal translational symmetry, will prove to have
the pragmatic value that an implementation can be more readily isolated from decoherence channels
while it is 'running' its program, thereby enabling lengthy computation without explicit
error-correction.

Our aim is to show that these two ways of programming a QCA are computationally equivalent. In
the proof we use the structure theorem for cellular automata obtained in [1]. We then use this
equivalence to build a universal autonomous QCA, with an explicitly given transition rule, where
"universal" means that it simulates the gate model up to polynomial overhead. This construction
employs a significantly smaller cell size than that of other similar machines discussed in the
literature [3, ?].

The structure theorem holds in any lattice dimension, and so do the ideas of our construction,
but we stick to the one-dimensional case as it is sufficient for bounded-error quantum
polynomial-time computation (BQP). The advantage of using just one lattice dimension has also
been stressed by some authors [?] for practical engineering concerns. However, first
implementations of higher dimensional lattices already exist [?].

General construction techniques

From gate model to ccQCA 

Consider the gate model of quantum computation wherein qubits are present in a 1-dimensional
lattice, and any gate may act unitarily on just two neighbouring qubits. Such models are
universal in the sense of BQP computation [?]. There are various direct ways of implementing
such circuits as classically controlled QCAs: for example, one could envisage a data band for
encoding the qubits of a circuit, and a pointer band for encoding a single data-pointer. The
transformations of the ccQCA could manipulate the location of the pointer and then use that
pointer to break the spatial symmetry of the dynamics so that individual specific neighbouring
qubit pairs may be addressed, as required. The data band and pointer band can of course be
regarded as one band, by interleaving their qubits.



From ccQCA to QCA 

We now give a construction which turns an arbitrary ccQCA with finite instruction set into a
QCA without classical control. The design of the data band of the ccQCA will be retained, and
its programs will be encoded in an additional band of quantum cells.

The main tool needed for the decomposition is the QCA structure theorem (Theorem 6 in [1]).
This theorem guarantees the existence of two unitary operations ($U_i$ and $V_i$) for each of
the ccQCA transition rules which implement the time evolution by sequential application to
non-overlapping neighbourhoods (as indicated in FIG. 1). The structure theorem applies directly
to nearest-neighbour ccQCAs. To apply it to a ccQCA with larger neighbourhood, one needs to
group adjacent cells of the ccQCA into "super-cells" such that one ends up with a
nearest-neighbour interaction for the super-cell structure. Note that the interaction is not
changed by this way of reorganizing the QCA, and the neighbourhood of the regrouped ccQCA is
only slightly enlarged to become a union of super-cells. In any case, the analysis can be
restricted to nearest-neighbour automata.
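
The regrouping into super-cells can be illustrated with a small sketch. It assumes, purely for illustration, a one-dimensional rule in which cell x depends on the cells x - r, ..., x + r, and it groups a fixed number of consecutive cells into one super-cell. The names `supercell` and `is_nearest_neighbour` are hypothetical and do not come from the paper.

```python
def supercell(x: int, block: int) -> int:
    """Index of the super-cell containing original cell x (blocks of `block` cells)."""
    return x // block  # floor division also handles negative x correctly


def is_nearest_neighbour(radius: int, block: int, window: range = range(-50, 50)) -> bool:
    """Check that every cell's neighbourhood stays within adjacent super-cells."""
    for x in window:
        home = supercell(x, block)
        for y in range(x - radius, x + radius + 1):
            if abs(supercell(y, block) - home) > 1:
                return False  # some neighbour lies two or more super-cells away
    return True
```

With radius 3, blocks of three cells give a nearest-neighbour structure on the super-cells, whereas blocks of two cells do not.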

Consider an autonomous QCA consisting of a data band representing the one-dimensional lattice
of the ccQCA being simulated and a program band containing information about the sequence of
transition functions that the ccQCA would apply. Let the ccQCA contain k different
homomorphisms. Then the cell size of the program band is chosen to be 2k + 1, enough to
distinguish the 2k unitaries, allowing for an extra symbol representing the identity map. Let
the time evolution of the QCA be the product of a 'shift step' shifting the program band two
cells past the data band, followed by a 'calculation step' performing the required unitary maps
on pairs of cells of the data band, each controlled by the neighbouring contents of the program
band. As the program band moves past the data band, each data cell (qudit) undergoes the time
evolution of the ccQCA being simulated, yet it should be noted that different timesteps in the
ccQCA evolution are present at one timestep of the autonomous QCA (see FIG. 2). In accordance
with the definition of QCA, it is important that there arises no possibility of non-commuting
unitaries operating on the same cell at any time; note that the unitaries $U_i$ and $V_i$
obtained from the QCA structure theorem work on different combinations of odd and even cells
(see FIG. 1). Therefore, to circumvent this possibility, one can design the autonomous QCA such
that the localisation regions of the unitaries are separated by one idle cell, as depicted in
our example.

FIG. 1: Two timesteps of the ccQCA. The clocks indicate the time before and after application
of the unitaries for comparison to FIG. 2.

This construction gives an autonomous QCA which, since its dynamics must by definition be
(spatially) translationally symmetric, has cells composed of three cells of the original ccQCA
plus one cell from the program band; and it turns out to have only nearest-neighbour
interactions. This general construction scheme can be optimized in an explicit situation to
reduce the large cell size. Next we give such an explicit construction by starting from a
universal ccQCA with homomorphisms that already have sequential structure, so the Margolus
decomposition can be omitted.
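
To get a feeling for the cell size produced by the general scheme, the following sketch multiplies out the dimensions named above: three data cells of the original ccQCA times one program cell of size 2k + 1. The function name and the sample values (qubit cells, six homomorphisms) are illustrative assumptions, not figures quoted from the paper.

```python
def general_cell_dimension(d: int, k: int) -> int:
    """Cell dimension of the autonomous QCA from the general construction.

    d: dimension of one cell of the ccQCA being simulated
    k: number of different homomorphisms of the ccQCA
    """
    program_dim = 2 * k + 1  # 2k unitaries plus one symbol for the identity map
    data_dim = d ** 3        # three original ccQCA cells per autonomous cell
    return data_dim * program_dim


# Hypothetical example: a qubit ccQCA (d = 2) with k = 6 homomorphisms.
example_dimension = general_cell_dimension(2, 6)
```

For these sample values the general scheme yields a cell dimension of 104, which indicates why an optimized explicit construction is worthwhile.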



Explicit construction 

FIG. 2: Four timesteps of the autonomous QCA. As the cells are updated sequentially, the clocks
indicate the local time of the cells corresponding to the time of the ccQCA.

Construct circuit of controlled partial $\sigma_y$

There is a two-qubit gate which, if not constrained to act always on neighbouring qubits but
allowed to act on qubits within arbitrary range, serves as universal for computation within the
standard gate model. This gate, G, is given by Eq. (1) in the computational basis
$\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$. It performs a $\pi/4$ rotation on one
qubit conditioned on the setting of another. To see that this gate is universal, it suffices to
show that by use of simple (computational-basis) ancillae, one can simulate both the Hadamard
and the Toffoli gates, according to the well-known results of [?].
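
The explicit matrix of the gate is not reproduced in this text, so the following sketch assumes a form consistent with the description: the first qubit is the control, and when it is set the second qubit undergoes a real rotation by $\pi/4$. The sign convention of the rotation and all helper names are assumptions made for illustration.

```python
import math

THETA = math.pi / 4
C, S = math.cos(THETA), math.sin(THETA)

# Assumed form of G in the basis |00>, |01>, |10>, |11> (first qubit = control).
G = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, C, -S],
    [0.0, 0.0, S, C],
]

IDENTITY = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]


def matmul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(4)) for c in range(4)] for r in range(4)]


def transpose(a):
    return [[a[c][r] for c in range(4)] for r in range(4)]


def power(a, n):
    """n-th power of a 4x4 matrix by repeated multiplication."""
    result = IDENTITY
    for _ in range(n):
        result = matmul(result, a)
    return result


def close(a, b, tol=1e-12):
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(4) for c in range(4))
```

Under this assumption G is orthogonal and satisfies $G^8 = 1$, so $G^7$ acts as its inverse, and $G^2$ maps $|10\rangle$ to $|11\rangle$, i.e. it flips the target when the control is set (up to a sign on the other branch).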



Construct qubit ccQCA 

Consider a ccQCA on a 1-dimensional qubit lattice that allows the use of four different kinds
of QCA-homomorphisms as described below. To show how this ccQCA can be used to simulate an
arbitrary gate model circuit whose gates are of the kind G (1), we will think of the ccQCA's
qubits as belonging to three interleaved 1-dimensional lattices, and label them accordingly as
$d_i$, $a_i$, $h_i$ with $i \in \mathbb{Z}$. We will load the d-band with input corresponding
to the input of the gate array being simulated, initialising its unused qubits to $|0\rangle$.
The a-band will be used as 'ancilla space' and should be initialised to $|0\rangle$ everywhere.
The h-band is used to break spatial symmetry of the dynamics, and should contain a single
'pointer' $|1\rangle$ with the rest of its qubits containing $|0\rangle$.




FIG. 3: Structure of the 'three band' QCA with pointer, data and ancilla band. Localisation of
the homomorphisms is indicated.



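The four families of homomorphisms we allow are

$$A_i = \prod_{x \in \mathbb{Z}} G(h_x, a_{x+i}), \qquad B_i = \prod_{x \in \mathbb{Z}} G(h_x, d_{x+i}),$$
$$C_i = \prod_{x \in \mathbb{Z}} G(d_x, a_{x+i}), \qquad D_i = \prod_{x \in \mathbb{Z}} G(a_x, d_{x+i}) \qquad (2)$$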

(see FIG. 3) and we simulate the gate $G(d_i, d_j)$ by the classically controlled sequence
(reading right to left)

$$A_0^{7} \cdot C_{-i}^{7} \cdot B_i^{2} \cdot C_{-i} \cdot D_j \cdot C_{-i}^{7} \cdot B_i^{2} \cdot C_{-i} \cdot A_0 = G(d_i, d_j). \qquad (3)$$

The reader may check that this effects the required transformation on the d-band, restoring the
other two bands to their original configurations, assuming the initialisations described above.
Of course, a clever compiler would find ways of simulating a given circuit that are far more
efficient than repeated application of this technique.
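
The check left to the reader can be sketched numerically on a small ring. The sketch rests on several assumptions that are not fixed by the text: G is taken as a real rotation by $\pi/4$ on the target when the control is set, the pointer sits at $h_0$, each band has three sites with periodic boundary conditions, and the sequence is read as $A_0^7 \cdot C_{-i}^7 \cdot B_i^2 \cdot C_{-i} \cdot D_j \cdot C_{-i}^7 \cdot B_i^2 \cdot C_{-i} \cdot A_0$. All function names are hypothetical.

```python
import math

N = 3              # sites per band on a small periodic ring (assumption)
H, A, D = 0, 1, 2  # band labels: pointer, ancilla, data


def qubit(band, x):
    """Bit position of site x (taken modulo N) in the given band."""
    return band * N + x % N


def crot(state, c, t, k):
    """Apply G^k: rotate qubit t by k*pi/4 if qubit c is set (assumed form of G)."""
    co, si = math.cos(k * math.pi / 4), math.sin(k * math.pi / 4)
    out = list(state)
    for s in range(len(state)):
        if (s >> c) & 1 and not (s >> t) & 1:
            s1 = s | (1 << t)  # partner basis state with the target bit set
            out[s] = co * state[s] - si * state[s1]
            out[s1] = si * state[s] + co * state[s1]
    return out


def hom(state, cband, tband, shift, k=1):
    """k-th power of the homomorphism prod_x G(cband_x, tband_{x+shift})."""
    for x in range(N):
        state = crot(state, qubit(cband, x), qubit(tband, x + shift), k)
    return state


def sequence(state, i, j):
    """The assumed nine-factor sequence, applied right to left."""
    steps = [(H, A, 0, 1), (D, A, -i, 1), (H, D, i, 2), (D, A, -i, 7),
             (A, D, j, 1),
             (D, A, -i, 1), (H, D, i, 2), (D, A, -i, 7), (H, A, 0, 7)]
    for cband, tband, shift, k in steps:
        state = hom(state, cband, tband, shift, k)
    return state


def max_deviation(i=1, j=2):
    """Largest amplitude difference between the sequence and -G(d_i, d_j)."""
    worst = 0.0
    for d_bits in range(2 ** N):
        start = [0.0] * 2 ** (3 * N)
        index = 1 << qubit(H, 0)  # single pointer at h_0, ancillas all zero
        for x in range(N):
            index |= ((d_bits >> x) & 1) << qubit(D, x)
        start[index] = 1.0
        got = sequence(start, i, j)
        want = crot(start, qubit(D, i), qubit(D, j), 1)
        worst = max(worst, max(abs(g + w) for g, w in zip(got, want)))
    return worst
```

Under these assumptions the sequence reproduces $G(d_i, d_j)$ on the d-band up to an overall sign and returns the pointer and ancilla bands to their initial configuration; `max_deviation` measures the residual difference over all computational-basis inputs of the d-band.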

Construct nearest neighbour qubit ccQCA 

The ccQCA described above employs operations with arbitrarily large neighbourhood. With
additional neighbour-swap operations we can first move two bands to the required interband
distance $i$, then apply a cellwise G-operation ($A_0$, $B_0$, $C_0$, or $D_0$) and finally
shift back, in order to implement all operations (2) with a nearest-neighbour ccQCA. Moreover,
the interband G-operations are quite similar, which suggests interleaving the three bands into
a single qubit band labelled $q_i$, $i \in \mathbb{Z}$:

$$(\ldots, q_{-2}, q_{-1}, q_0, q_1, q_2, \ldots) = (\ldots, a_{-1}, h_{-1}, d_0, a_0, h_0, \ldots).$$

A sufficient set of operations is then given by

$$E_i = \prod_{x \in \mathbb{Z}} \mathrm{Swap}(q_{3x+i}, q_{3x+i+1}), \qquad F_i = \prod_{x \in \mathbb{Z}} G(q_{3x+i}, q_{3x+i+1}), \qquad (4)$$

for $i \in \{0, 1, 2\}$. It should be noted that it suffices to move two bands relative to each
other, since the homomorphisms of (2) act non-trivially on only two bands at a time.
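
The interleaving can be made concrete with a short sketch that maps an index of the single band back to its original band and site, reading the identification above as $q_{3x} = d_x$, $q_{3x+1} = a_x$, $q_{3x+2} = h_x$. The helper names `band_site` and `pairs` are illustrative only.

```python
BANDS = ("d", "a", "h")  # q_{3x} = d_x, q_{3x+1} = a_x, q_{3x+2} = h_x


def band_site(n: int):
    """Map the interleaved index n to a (band, site) pair."""
    return BANDS[n % 3], n // 3  # Python's % and // are consistent for negative n


def pairs(i: int, xs=range(-2, 3)):
    """Qubit pairs addressed by E_i and F_i, written in terms of the original bands."""
    return [(band_site(3 * x + i), band_site(3 * x + i + 1)) for x in xs]
```

For instance, `pairs(0, [0])` gives the pair $(d_0, a_0)$, while `pairs(2, [0])` gives $(h_0, d_1)$, so the operations with $i = 2$ couple neighbouring triples.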

FIG. 4: Structure of the autonomous QCA. Indicated are three cells with one program qutrit and
two data qubits each; identification of the data qubits corresponding to FIG. 3.

Construct universal nearest-neighbour QCA 

The homomorphisms of the ccQCA already work on non-overlapping neighbourhoods, and so there is
no need for further Margolus decomposition here. Next we offer a design which further minimizes
the Hilbert space dimension of the cells of which the QCA will be composed.

Take a 1-dimensional lattice of qudits labelled $c_i$ (for $i \in \mathbb{Z}$) of single cell
dimension $d = 12$, and regard these as incorporating one qutrit cell of the program band $t_i$
with two qubit cells $q_{2i}$ and $q_{2i+1}$ from the data band. The cell $c_i$ we define
explicitly as the tensor product

$$c_i = t_i \otimes q_{2i} \otimes q_{2i+1}. \qquad (5)$$

(Identification of data cells is indicated in FIG. 4.) As before, it is not necessary to have
any 'fine control' over the relative motion of the two sub-bands; rather we simply allow one to
pass by the other with an invariant velocity. This is again achieved by decomposing the QCA
transformation step into two parts, a unitary and a shift:

U: $\mathbb{C}^{12} \to \mathbb{C}^{12}$, acting on every cell simultaneously,
S: the shift sliding the bands relative to each other. (6)

To simulate the nearest-neighbour ccQCA, we will interpret the data band $q_i$ exactly as
before, but the program band $t_i$ must be initialised so as to execute the appropriate
transformations on the data band as the two bands slide past one another. At initialisation,
the cells $i \geq 0$ will be used to hold the non-zero content of the data band in their
qubits, while the cells $i < 0$ will be used to hold the program band in their qutrits. We will
initialise the $t_i$ in the computational basis, and the U operation will be defined to leave
these qutrits invariant. Specifically, $t_i = |0\rangle$ will cause no transformation,
$t_i = |1\rangle$ will cause a swap of data between $q_{2i}$ and $q_{2i+1}$, and
$t_i = |2\rangle$ will cause the transformation $G(q_{2i}, q_{2i+1})$ described in (1).

Each of the homomorphisms of (4) is simulated not by one QCA timestep but by three neighbouring
qutrits sliding past all of the data-bearing qubits. Specifically, the program-segment
$|t_{3i}\, t_{3i+1}\, t_{3i+2}\rangle = |100\rangle$ will simulate the homomorphism $E_0$, and
the program-segments $|010\rangle$ and $|001\rangle$ will simulate the homomorphisms $E_2$ and
$E_1$. Likewise, the program-segments $|200\rangle$, $|020\rangle$, $|002\rangle$ will simulate
the homomorphisms $F_0$, $F_2$, $F_1$, respectively. The cells with negative index may be
initialised with program-segments of these kinds in order to induce the desired transformations
on the data. The computation output may be read (in the computational basis) any time after the
content of the program band has moved past the content of the data band.
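
The assignment of program-segments to homomorphisms can be reproduced with a purely classical toy model that tracks only swaps. It rests on one assumption that the text does not spell out: the two bands are taken to slide past each other by three data qubits per time step, so that the qutrit starting in cell p faces the data qubits 2p + 3τ and 2p + 3τ + 1 at step τ. The function names are hypothetical.

```python
def run_segment(segment, data, start_cell=-3, velocity=3):
    """Slide one three-qutrit program segment past classical data, applying swaps.

    segment: values of three neighbouring qutrits (0 = idle, 1 = swap),
             initially placed in cells start_cell, start_cell + 1, start_cell + 2.
    velocity: assumed relative motion of the bands in data qubits per time step.
    """
    data = list(data)
    for tau in range(len(data)):  # enough steps for the segment to pass all data
        for offset, t in enumerate(segment):
            left = 2 * (start_cell + offset) + velocity * tau
            if t == 1 and 0 <= left and left + 1 < len(data):
                data[left], data[left + 1] = data[left + 1], data[left]
    return data


def apply_E(i, data):
    """Reference: E_i swaps q_{3x+i} and q_{3x+i+1} for every x in the window."""
    data = list(data)
    for left in range(i, len(data) - 1, 3):
        data[left], data[left + 1] = data[left + 1], data[left]
    return data


window = list(range(12))  # labels standing in for the data qubits q_0 ... q_11
```

Under the stated assumption the segments (1, 0, 0), (0, 1, 0) and (0, 0, 1) act on `window` exactly as $E_0$, $E_2$ and $E_1$, in agreement with the assignment given above.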

To show that this simulation is efficient, one needs to estimate the necessary resources.
Consider a quantum gate circuit (QGC) of $\mathrm{Space}_{QGC}$ qubit-wires and
$\mathrm{Time}_{QGC}$ G-gates. In the first simulation step, the resources of the ccQCA depend
linearly on the corresponding QGC resources. The use of swap gates in the next step increases
the time; the encoding of the program into the program band increases the space, so one ends up
with an estimate for the autonomous QCA of

$$\mathrm{Time}_{QCA} = O(\mathrm{Time}_{QGC} \cdot \mathrm{Space}_{QGC}),$$
$$\mathrm{Space}_{QCA} = O(\mathrm{Time}_{QGC} \cdot \mathrm{Space}_{QGC}). \qquad (7)$$

The resources depend polynomially on the given QGC, so the simulation is efficient.